# -*- mode: org; -*-
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/htmlize.css"/>
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/readtheorg.css"/>
#+TITLE: Grep
   
* GREP
 /Dependencies:/ Before learning grep, it helps to be familiar with the following:
    - Unix Pipes
    - Shell Basics
    - Data Streams
    - Unix-like Operating Systems

** Simple description of Grep:
Grep is a UNIX command that searches text for lines matching a pattern.
If we want to look for "text" in $file, then grep is exactly the tool we need!
It can also read its input from a UNIX pipeline ~|~.

+ Before we start learning grep, please create the following file named ~example_file.txt~:
#+BEGIN_EXAMPLE 
example_file.txt
The Apple is red
The orange is old
The pear is tasty 
The date of the party
The dragon likes to eat fruit
I don't know what dewberry is
My berries are too old
for $FILE 
*bold*
\path\to\file
100.00
52.34
Up here^
[some stuff]
#+END_EXAMPLE


** Basic example

#+BEGIN_EXAMPLE  bash
$ grep orange example_file.txt
The orange is old
#+END_EXAMPLE


That simple command shows the basic way to use grep: you call grep with two arguments. The first is the text you want to find;
the second is the file in which to look for it. That is the simplest case, so let's look at better examples.
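
The two-argument form and the pipeline form mentioned earlier can be tried side by side. The sketch below is only an illustration: it copies three of the sample lines into a temporary file (the variable names and the use of ~mktemp~ are assumptions, not part of this tutorial) and searches it both ways.

#+BEGIN_EXAMPLE bash
# Create a throwaway copy of a few lines from example_file.txt.
sample=$(mktemp)
printf '%s\n' 'The Apple is red' 'The orange is old' 'The pear is tasty' > "$sample"

# Form 1: pattern first, file second.
from_file=$(grep orange "$sample")

# Form 2: no file argument, so grep reads standard input from the pipe.
from_pipe=$(cat "$sample" | grep orange)

echo "$from_file"
echo "$from_pipe"

rm -f "$sample"
#+END_EXAMPLE

Both forms should print the same line, because grep does not care whether the text comes from a file or from a pipe.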

** Better examples

Now that we have seen grep in action, it is time to put it to better use; after all, grep is one of the most powerful and useful commands in the UNIX world.
First of all, notice that grep is *case sensitive.* This means that if you call ~grep foo file.txt~, grep will find "foo" but not "Foo".

+ Case insensitive search can be done with the flag ~-i~

#+BEGIN_EXAMPLE  bash
$ grep -i apple example_file.txt
The Apple is red
#+END_EXAMPLE

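It is common in the UNIX world to search for text that contains metacharacters such as ~^$\.*[]~, which grep normally treats as special.

+ Metacharacters can be found literally with the flag ~-F~, which makes grep treat the pattern as a fixed string: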

#+BEGIN_EXAMPLE  bash
$ grep -F '$FILE' example_file.txt 
for $FILE
#+END_EXAMPLE
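
To see what ~-F~ changes, it helps to run the same pattern with and without it. This is a minimal sketch under a few assumptions: the temporary file and variable names are made up for the illustration, and the line ~100x00~ is a hypothetical extra line that is not in ~example_file.txt~.

#+BEGIN_EXAMPLE bash
sample=$(mktemp)
# '100x00' is a hypothetical line, added only to show the difference.
printf '%s\n' '100.00' '100x00' '52.34' '[some stuff]' > "$sample"

# As a regular expression, '.' matches any character: two lines match.
regex_hits=$(grep -c '100.00' "$sample")

# With -F the pattern is a fixed string: only the literal 100.00 matches.
fixed_hits=$(grep -c -F '100.00' "$sample")

# Brackets are taken literally as well.
bracket_line=$(grep -F '[some stuff]' "$sample")

echo "regex: $regex_hits, fixed: $fixed_hits"
echo "$bracket_line"

rm -f "$sample"
#+END_EXAMPLE

The ~-c~ flag is used here only to count matching lines instead of printing them.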


+ Another common case is searching for two or more different strings. There are different options here:
    - Match any of the strings: use the ~-e~ flag once per string

    #+BEGIN_EXAMPLE  bash
$ grep -e 'foo' -e 'bar' file.txt
    #+END_EXAMPLE


    - Match all of the strings: use a pipeline ~|~ to stream the output of one grep into another

#+BEGIN_EXAMPLE  bash
$ grep 'foo' file.txt | grep 'bar'
#+END_EXAMPLE
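
The difference between "any" and "all" is easier to see on real lines. The following sketch assumes a temporary file holding four lines taken from ~example_file.txt~; the variable names are illustrative only.

#+BEGIN_EXAMPLE bash
sample=$(mktemp)
printf '%s\n' 'The Apple is red' 'The orange is old' \
    'The pear is tasty' 'My berries are too old' > "$sample"

# Any of the strings: a line matches if it contains "orange" OR "pear".
any_match=$(grep -e 'orange' -e 'pear' "$sample")

# All of the strings: the second grep only sees lines that already
# contain "The", so the result contains "The" AND "old".
all_match=$(grep 'The' "$sample" | grep 'old')

echo "$any_match"
echo "$all_match"

rm -f "$sample"
#+END_EXAMPLE

The line "My berries are too old" contains "old" but not "The", so the pipeline leaves it out.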

+ Sometimes you are looking for a pattern rather than an exact string. For example, given a set of names such as file01, file02 and so on, a period ~.~ can be used as a wildcard that matches exactly one character:

#+BEGIN_EXAMPLE  bash
$ ls
file01 file02 file03 file001
$ ls | grep file.1 
file01
#+END_EXAMPLE

- Notice that /file001/ was not matched, since ~.~ stands for exactly one character. If you need more characters, add one ~.~ for each character:

#+BEGIN_EXAMPLE  bash
$ ls
file01 file02 file03 file001
$ ls | grep file..1 
file001
#+END_EXAMPLE

Now that we know how to use the wildcard ~.~, you might be wondering what happens if you are looking for a file name that contains a period, like file.txt.

- In this case the correct command is ~grep 'file\.txt'~, using both ~'~ and ~\~. Otherwise grep will also match file1txt, file-txt, fileatxt and so on:

#+BEGIN_EXAMPLE  bash
$ ls
file.txt file1txt file-txt fileatxt
$ ls | grep 'file\.txt'
file.txt
#+END_EXAMPLE
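
The effect of the backslash can be checked without creating any files. In this sketch the output of ~ls~ is simulated with ~printf~ (an assumption made so the example runs anywhere), using the four names from the listing above.

#+BEGIN_EXAMPLE bash
# One name per line, as ls prints them when its output goes to a pipe.
names=$(printf '%s\n' file.txt file1txt file-txt fileatxt)

# Unescaped: '.' is a wildcard, so all four names match.
unescaped_hits=$(printf '%s\n' "$names" | grep -c 'file.txt')

# Escaped: '\.' matches only a literal period.
escaped_match=$(printf '%s\n' "$names" | grep 'file\.txt')

echo "unescaped matches: $unescaped_hits"
echo "escaped match: $escaped_match"
#+END_EXAMPLE

The single quotes matter as well: without them the shell would consume the backslash before grep ever sees it.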

*** Table for grep commands
| Metacharacter | Function                                                    | Example      | Description                                                                                       |
|---------------+-------------------------------------------------------------+--------------+---------------------------------------------------------------------------------------------------|
| ^             | Beginning-of-line anchor                                    | ~'^Up'~      | Will display all lines beginning with ~Up~                                                        |
| $             | End-of-line anchor                                          | ~'old$'~     | Will display all lines ending with ~old~                                                          |
| .             | Matches any single character                                | ~'a..e'~     | Will display lines containing ~a~ followed by /two/ characters, followed by ~e~                   |
| *             | Matches zero or more repetitions of the preceding character | ~'too*'~     | Will display lines containing ~to~ followed by zero or more ~o~ characters, such as ~to~ or ~too~ |
| [ ]           | Matches a single character in the set                       | ~'[Aa]pple'~ | Will display lines containing ~Apple~ or ~apple~                                                  |
| [^]           | Matches a single character not in the set                   | ~'[^Tt]he'~  | Will display lines where ~he~ is preceded by any character other than ~T~ or ~t~                  |
| \<            | Beginning-of-word anchor                                    | ~'\<date'~   | Will display lines containing a word that begins with "date"                                      |
| \>            | End-of-word anchor                                          | ~'ear\>'~    | Will display lines containing a word that ends with "ear"                                         |
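
The patterns in the table can all be tried against the sample file in one go. The sketch below rests on some assumptions: it recreates the contents of ~example_file.txt~ in a temporary file (without trailing spaces), and the variable names are illustrative. The word anchors ~\<~ and ~\>~ are extensions to POSIX regular expressions, so the last two patterns assume a grep that supports them, such as GNU grep.

#+BEGIN_EXAMPLE bash
sample=$(mktemp)
# The quoted 'EOF' keeps $FILE and the backslashes literal.
cat > "$sample" <<'EOF'
The Apple is red
The orange is old
The pear is tasty
The date of the party
The dragon likes to eat fruit
I don't know what dewberry is
My berries are too old
for $FILE
*bold*
\path\to\file
100.00
52.34
Up here^
[some stuff]
EOF

# For each pattern, print a header line and then the lines it matches.
report=$(
    for pattern in '^Up' 'old$' 'a..e' 'too*' '[Aa]pple' '[^Tt]he' '\<date' 'ear\>'; do
        printf '== %s\n' "$pattern"
        grep -e "$pattern" "$sample"
    done
)
echo "$report"

rm -f "$sample"
#+END_EXAMPLE

Note that ~'[^Tt]he'~ finds "Up here^" but none of the lines starting with "The", since the bracket expression has to match one character that is neither ~T~ nor ~t~.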



** What else can be done 
Cool stuff that you can do now with this new knowledge.


A basic use of the pipeline with grep is explained next. The outputs shown assume a hypothetical file ~fruits.txt~ with one fruit name per line (apple, orange, pear, date, dragonfruit, dewberry, berries), not the ~example_file.txt~ created earlier:

#+BEGIN_EXAMPLE bash
$ grep -v "e$" fruits.txt | grep "^d"
#+END_EXAMPLE

The first command, ~grep -v "e$" fruits.txt~, matches all lines ending in ~"e"~. The ~"-v"~ flag means /omit all
matches,/ so the lines matching ~"e$"~ are dropped and only the remaining lines are passed on.

(*Note:* In a regular expression, ~$~ matches the end of a line; conversely, ~^~ matches the beginning of a line.)

The result of the first command is shown below (it goes into the pipe, so the user never sees it on the terminal):
    #+BEGIN_EXAMPLE   
pear
dragonfruit
dewberry
berries
#+END_EXAMPLE
This is then piped into the second command, ~grep "^d"~. Just as having ~"$"~ next to ~"e"~ meant /match e
when it's next to the end of the line/, having ~"^"~ next to ~"d"~ means /match d when it's next to the start of the line./

So, the final output would be this:
    #+BEGIN_EXAMPLE  
dragonfruit
dewberry
#+END_EXAMPLE
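
The two stages of this pipeline can be inspected separately. The sketch below is an illustration under an assumption: the input is a list of one fruit name per line, inferred from the outputs shown above rather than given in this tutorial, and written to a temporary file. The variable names are made up for the example.

#+BEGIN_EXAMPLE bash
fruits=$(mktemp)
# Assumed input: one fruit name per line.
printf '%s\n' apple orange pear date dragonfruit dewberry berries > "$fruits"

# Stage 1: -v inverts the match, keeping only lines that do NOT end in "e".
stage1=$(grep -v 'e$' "$fruits")

# Stage 2: from those lines, keep only the ones that start with "d".
stage2=$(printf '%s\n' "$stage1" | grep '^d')

echo "after stage 1:"
echo "$stage1"
echo "after stage 2:"
echo "$stage2"

rm -f "$fruits"
#+END_EXAMPLE

Here "date" starts with ~d~ but is already removed in the first stage because it ends in ~e~, which is why it never reaches the second grep.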

** Exercise
The script below, ~exercise.sh~, downloads the exercise file ~exercise.txt~:
    
#+BEGIN_EXAMPLE bash
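# Use wget if it is installed; otherwise fall back to curl.
if command -v wget > /dev/null 2>&1
then
    # wget saves the download as exercise.txt in the current directory.
    wget komprendo.net/x/x/exercise.txt
else
    # curl writes to stdout, so redirect it into exercise.txt.
    curl komprendo.net/x/x/exercise.txt > exercise.txt
fi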
#+END_EXAMPLE


** Resources 
- ~info grep~ (more detailed than ~man grep~)
- [[https://learnbyexample.gitbooks.io/command-line-text-processing/content/gnu_grep.html][Learn By Example - Grep]]
- [[http://www.panix.com/~elflord/unix/grep.html][Unix and Linux - Grep]]
- [[https://linuxjourney.com/lesson/grep-command][Linux Journey - Grep]]

